BACKGROUND OF THE INVENTION

The present invention relates to a virus capable of inducing
lymphadenopathies (hereinafter "LAS") and acquired immuno-depressive
syndromes (hereinafter "AIDS"), to antigens of this virus,
particularly in a purified form, and to a process for producing these
antigens, particularly antigens of the envelope of this virus. The
invention also relates to polypeptides, whether glycosylated or not,
produced by the virus and to DNA sequences which code for such
polypeptides. The invention further relates to cloned DNA sequences
hybridizable to genomic RNA and DNA of the lymphadenopathy
associated virus (hereinafter "LAV") of this invention and to
processes for their preparation and their use. The invention still
further relates to a stable probe including a DNA sequence which can
be used for the detection of the LAV virus of this invention or
related viruses or DNA proviruses in any medium, particularly
biological, and in samples containing any of them.

An important genetic polymorphism has been recognized for the human
retrovirus which is the cause of AIDS and other diseases like LAS and
AIDS-related complex (hereinafter "ARC") (for review, see Weiss).
Indeed all of the isolates, analyzed until now, have had distinct
restriction maps [Benn et al., 1985]. Identical restriction maps have
only been observed for the first two isolates which were designated
LAV [Alizon et al., 1984] and human T-cell lymphotropic virus type 3
(hereinafter "HTLV-3") [Hahn et al., 1984] and which appear to be
exceptions. The genetic polymorphism of the AIDS virus was better
assessed after the determination of the complete nucleotide sequence
of LAV [Wain-Hobson et al., 1985], HTLV-3 [Ratner et al., 1985;
Muesing et al., 1985] and an isolate designated AIDS-associated
retrovirus (hereinafter "ARV 2") [Sanchez-Pescador et al., 1985]. In
particular, it appeared that, besides the nucleic acid variations
responsible for the restriction map polymorphism, isolates could
differ significantly at the protein level, especially in the envelope
(up to 13 % of difference between ARV and LAV), by both amino acid
substitutions and reciprocal insertions-deletions [Rabson and Martin,
1985].

Nevertheless, such differences did not go so far as to destroy the
immunological similarity of such isolates, as evidenced by the
capabilities of their similar proteins (e.g., core proteins of
similar nature, such as the p25 proteins, or similar envelope
glycoproteins, such as the 110-120 kD glycoproteins) to
immunologically cross-react. Accordingly, the proteins of any of said
LAV viruses can be used for the in vitro detection of antibodies
induced in vivo and present in biological fluids obtained from
individuals infected with the other LAV variants. Therefore, these
viruses are grouped together as a class of LAV viruses (hereinafter
"LAV-1 viruses").



SUMMARY OF THE INVENTION

In accordance with this invention, a new virus 
has been discovered that is responsible for diseases 
clinically related to AIDS and that can be classified as 
a LAV-1 virus but that differs genetically from known 
LAV-1 viruses to a much larger extent than the known 
LAV-1 viruses differ from each other. The new virus is basically
characterized by the cDNA sequence which is shown in Figures 7A to
7I, and this new virus is hereinafter generally referred to as
"LAV MAL".

Also in accordance with this invention, variants of the new virus are
provided. The RNAs of these variants and the related cDNAs derived
from said RNAs are hybridizable to corresponding parts of the cDNA of
LAV MAL. The DNA of the new virus also is provided, as well as DNA
fragments derived therefrom hybridizable with the genomic RNA of
LAV MAL, such DNA and DNA fragments particularly consisting of the
cDNA or cDNA fragments of LAV MAL or of recombinant DNAs containing
such cDNA or cDNA fragments.

DNA recombinants containing the DNA or DNA fragments of LAV MAL or
its variants are also provided. It is of course understood that
fragments which would include some deletions or mutations which would
not substantially alter their capability of also hybridizing with the
retroviral genome of LAV MAL are to be considered as forming obvious
equivalents of the DNA or DNA fragments referred to hereinabove.

Cloned probes are further provided which can be made starting from
any DNA fragment according to the invention, as are recombinant DNAs
containing such fragments, particularly any plasmids amplifiable in
procaryotic or eucaryotic cells and carrying said fragments. Using
cloned DNA containing a DNA fragment of LAV MAL as a molecular
hybridization probe - either by marking with radionuclides or with
fluorescent reagents - LAV virion RNA may be detected directly, for
example, in blood, body fluids and blood products (e.g., in
antihemophilic factors such as Factor VIII concentrates). A suitable
method for achieving such detection comprises immobilizing LAV MAL on
a support (e.g., a nitrocellulose filter), disrupting the virion and
hybridizing with a labelled (radiolabelled or "cold" fluorescent- or
enzyme-labelled) probe. Such an approach has already been developed
for Hepatitis B virus in peripheral blood (according to Scotto J. et
al. Hematology (1983), 1, 379-384).

Probes according to the invention can also be used for rapid
screening of genomic DNA derived from the tissue of patients with LAV
related symptoms to see if the proviral DNA or RNA present in their
tissues is related to LAV MAL. A method which can be used for such
screening comprises the following steps: extraction of DNA from
tissue, restriction enzyme cleavage of said DNA, electrophoresis of
the fragments and Southern blotting of genomic DNA from tissues and
subsequent hybridization with labelled cloned LAV proviral DNA.
Hybridization in situ can also be used. Lymphatic fluids and tissues
and other non-lymphatic tissues of humans, primates and other
mammalian species can also be screened to see if other evolutionary
related retroviruses exist. The methods referred to hereinabove can
be used, although hybridization and washings would be done under
non-stringent conditions.

The DNA according to the invention can be used also for achieving the
expression of LAV viral antigens for diagnostic purposes, as well as
for the production of a vaccine against LAV. Fragments of particular
advantage in that respect will be discussed later. The methods which
can be used are multifold:

a) DNA can be transfected into mammalian cells with appropriate
selection markers by a variety of techniques, such as calcium
phosphate precipitation, polyethylene glycol, protoplast-fusion, etc.

b) DNA fragments corresponding to genes can be cloned into expression
vectors for E. coli, yeast or mammalian cells and the resultant
proteins purified.

c) The proviral DNA can be "shot-gunned" (fragmented) into
procaryotic expression vectors to generate fusion polypeptides.

Recombinants producing antigenically competent fusion proteins can be
identified by simply screening the recombinants with antibodies
against LAV antigens. Particular reference in this respect is made to
those portions of the genome of LAV MAL which, in the figures, are
shown to belong to open reading frames and which encode the products
having the polypeptide backbones shown.

Different polypeptides which appear in figures 7A to 7I are still
further provided. Methods disclosed in European application 0 178 978
and in PCT application PCT/EP 85/00548, filed Oct. 18, 1985, are
applicable for the production of such peptides from LAV MAL. In this
regard, polypeptides are provided containing sequences in common with
polypeptides comprising antigenic determinants included in the
proteins encoded and expressed by the LAV MAL genome. Means are also
provided for the detection of proteins of LAV MAL, particularly for
the diagnosis of AIDS or pre-AIDS or, to the contrary, for the
detection of antibodies against LAV MAL or its proteins, particularly
in patients afflicted with AIDS or pre-AIDS or more generally in
asymptomatic carriers and in blood-related products. Further provided
are immunogenic polypeptides and more particularly protective
polypeptides for use in the preparation of vaccine compositions
against AIDS or related syndromes.

Yet further provided are polypeptide fragments having lower molecular
weights and having peptide sequences or fragments in common with
those shown in figures 7A to 7I. Fragments of smaller sizes can be
obtained by resorting to known techniques, for instance, by cleaving
the original larger polypeptide by enzymes capable of cleaving it at
specific sites. By way of examples may be mentioned the enzyme of
Staphylococcus aureus V8, alpha-chymotrypsin, "mouse sub-maxillary
gland protease" marketed by the Boehringer company, and Vibrio
collagenase, which specifically recognizes the peptides Gly-Pro,
Gly-Ala, etc.

Other features of this invention will appear in the following
disclosure of data obtained starting from LAV MAL, in relation to the
drawings.



BRIEF DESCRIPTION OF THE DRAWINGS

- Figs. 1A and 1B provide comparative restriction maps of the genomes
of LAV MAL as compared to LAV ELI (Applicants' related new LAV virus
which is the subject of their copending application, filed herewith)
and LAV BRU (a known LAV isolate deposited at the Collection
Nationale des Cultures de Micro-organismes (hereinafter "CNCM") of
the Pasteur Institute, Paris, France under No. I-232 on July 15,
1983);

- Fig. 2 shows comparative maps setting forth the relative positions
of the open reading frames of the above genomes;

- Figs. 3A-3F (also designated generally hereinafter "fig. 3")
indicate the relative correspondence between the proteins (or
glycoproteins) encoded by the open reading frames, whereby amino acid
residues of protein sequences of LAV MAL are in vertical alignment
with corresponding amino acid residues (numbered) of corresponding or
homologous proteins or glycoproteins of LAV BRU, as well as LAV ELI
and ARV 2;

- Figs. 4A-4B (also designated generally hereinafter "fig. 4")
provide tables quantifying the sequence divergence between homologous
proteins of LAV BRU, LAV ELI and LAV MAL;

- Fig. 5 shows diagrammatically the degree of divergence of the
different virus envelope proteins;

- Figs. 6A and 6B ("Fig. 6" when consulted together) render apparent
the direct repeats which appear in the proteins of the different AIDS
virus isolates;

- Figs. 7A-7I show the full nucleotide sequences of LAV MAL.



DETAILED DESCRIPTION OF THE INVENTION

CHARACTERISTICS AND MOLECULAR CLONING OF AN AFRICAN ISOLATE.

The different AIDS virus isolates concerned are designated by three
letters of the patient's name, LAV BRU referring to the prototype
AIDS virus isolated in 1983 from a French homosexual patient with LAS
and thought to have been infected in the USA in the preceding years
[Barre-Sinoussi et al., 1983]. LAV MAL was recovered in 1985 from a
7-year old boy from Zaire. Related LAV ELI was recovered in 1983 from
a 24-year old woman with AIDS from Zaire. Recovery and purification
of the LAV MAL virus were performed according to the method disclosed
in European Patent Application 84 401834/138 667 filed on September
9, 1984.

LAV MAL is indistinguishable from the previously characterized
isolates by its structural and biological properties in vitro. Virus
metabolic labelling and immune precipitation by patient MAL sera, as
well as reference sera, showed that the proteins of LAV MAL had the
same molecular weight (hereinafter "MW") as, and cross-reacted
immunologically with those of, prototype AIDS virus (data not shown)
of the LAV-1 class.

Reference is again made to European Application 178 978 and
International Application PCT/EP 85/00548 as concerns the
purification and sequencing procedures used herein. See also the
discussion under the heading "Experimental Procedures and
Significance of the Figures" hereinafter.

Primary restriction enzyme analysis of the LAV MAL genome was done by
Southern blot with total DNA derived from acutely infected
lymphocytes, using cloned LAV BRU complete genome as probe. Overall
cross-hybridization was observed under stringent conditions, but the
restriction profile of the Zairian isolate was clearly different.
Phage lambda clones carrying the complete viral genetic information
were obtained and further characterized by restriction mapping and
nucleotide sequence analysis. A clone (hereinafter "M-H11") was
obtained by complete HindIII restriction of DNA from LAV MAL-infected
cells, taking advantage of the existence of a unique HindIII site in
the long terminal repeat (hereinafter "LTR"). M-H11 is thus probably
derived from unintegrated viral DNA, since that species was at least
ten times more abundant than integrated provirus.

Figure 1B gives a comparison of the restriction maps derived from the
nucleotide sequences of LAV ELI, LAV MAL and prototype LAV BRU, as
well as from three other Zairian isolates (hereinafter "Z1", "Z2" and
"Z3" respectively) previously mapped for seven restriction enzymes
[Benn et al., 1985]. Despite this limited number, all of the profiles
are clearly different (out of the 23 sites making up the map of
LAV BRU, only seven are present in all six maps presented),
confirming the genetic polymorphism of the AIDS virus. No obvious
relationship is apparent between the five Zairian maps, and all of
their common sites are also found in LAV BRU.

Conservation of the genetic organization. 

The genetic organization of LAV MAL as deduced from the complete
nucleotide sequences of its cloned genome is identical to that found
in other isolates, i.e., 5'-gag-pol-central region-env-F-3'. Most
noticeable is the conservation of the "central region" (fig. 2),
located between the pol and env genes, which is composed of a series
of overlapping open reading frames (hereinafter "orf") previously
designated Q, R, S, T and U in the ovine lentivirus visna [Sonigo et
al., 1985]. The product of orf S (also designated "tat") is
implicated in the transactivation of virus expression [Sodroski et
al., 1985; Arya et al., 1985]; the biological role of the product of
orf Q (also designated "sor" or "orf A") is still unknown [Lee et
al., 1986; Kang et al., 1986]. Of the three other orfs, R, T and U,
only orf R is likely to be a seventh viral gene, for the following
reasons: the exact conservation of its relative position with respect
to Q and S (fig. 2), the constant presence of a possible splice
acceptor and of a consensus AUG initiator codon, its similar codon
usage with respect to viral genes, and finally the fact that the
variation of its protein sequence within the different isolates is
comparable to that of gag, pol and Q (see fig. 4).

Also conserved are the sizes of the U3, R and U5 elements of the LTR
(data not shown), the location and sequence of their regulatory
elements such as TATA box and AATAAA polyadenylation signal, and
their flanking sequences, i.e., primer binding site (hereinafter
"PBS") complementary to the 3' end of tRNA LYS and polypurine tract
(hereinafter "PPT"). Most of the genetic variability within the LTR
is located in the 5' half of U3 (which encodes a part of orf F),
while the 3' end of U3 and R, which carry most of the cis-acting
regulatory elements, promoter, enhancer and trans-activating factor
receptor [Rosen et al., 1985], as well as the U5 element, are
well-conserved.

Overall, it clearly appears that this Zairian isolate, LAV MAL, is
the same type of retrovirus as the previously sequenced isolates of
American or European origin.

Variability of the viral proteins

Despite their identical genetic organization, LAV ELI and LAV MAL
show substantial differences in the primary structure of their
proteins. The amino acid sequences of LAV ELI and LAV MAL proteins
are presented in figures 3A-3F, aligned with those of LAV BRU and
ARV 2. Their divergence was quantified as the percentage of amino
acid substitutions in two-by-two alignments (Fig. 4). The number of
insertions and deletions that had to be introduced in each of these
alignments has also been reported.

Three general observations can be made. First, the protein sequences
of LAV ELI and LAV MAL are more divergent from LAV BRU than are those
of HTLV-3 and ARV 2 (Fig. 4A); similar results are obtained if ARV 2
is taken as reference (not shown). The range of genetic polymorphism
between isolates of the AIDS virus is considerably greater than
previously observed. Second, our two sequences confirm that the
envelope is more variable than the gag and pol genes. Here again, the
relatively small difference observed between the env of LAV BRU and
HTLV-3 appears as an exception. Third, the mutual divergence of
LAV ELI and LAV MAL (Fig. 4B) is comparable to that between LAV BRU
and either of them; as far as we can extrapolate from only three
sequenced isolates from the USA and Europe and two (LAV ELI and
LAV MAL) from Africa, this is indicative of a wider evolution of the
AIDS virus in Africa.
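
The two measures used above (percentage of amino acid substitutions in a two-by-two alignment, and the number of insertions and deletions introduced) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned and padded with "-" for gaps; the function name, the gap character and the toy sequences are illustrative assumptions and are not taken from the figures.

```python
def pairwise_divergence(aligned_a, aligned_b, gap="-"):
    """Return (percent_substitutions, indel_events) for two aligned sequences.

    Assumption: substitutions are counted only over columns where both
    sequences carry a residue, and a run of consecutive gap columns in the
    same sequence counts as a single insertion/deletion event.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0        # columns with a residue in both sequences
    substitutions = 0   # compared columns where the residues differ
    indel_events = 0    # number of gap runs
    gap_side = None     # which sequence the current gap run belongs to

    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column empty in both sequences: ignore it
        if a == gap or b == gap:
            side = "a" if a == gap else "b"
            if side != gap_side:
                indel_events += 1
                gap_side = side
            continue
        gap_side = None
        compared += 1
        if a != b:
            substitutions += 1

    percent = 100.0 * substitutions / compared if compared else 0.0
    return percent, indel_events


# Hypothetical toy alignment, for illustration only.
percent, indels = pairwise_divergence("MRVKG--IQ", "MKVRGTLIQ")
print(f"{percent:.1f} % substitutions, {indels} insertion/deletion event(s)")
```

Counting a run of gap columns as one event is a design choice of this sketch; the source does not state how individual insertions and deletions were delimited.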

gag and pol: Their greater degree of conservation compared to the
envelope is consistent with their encoding important structural or
enzymatic activities. Of the three mature gag proteins, the p25,
which was the first recognized immunogenic protein of LAV
[Barre-Sinoussi et al., 1983], is also the better conserved (fig. 3).
In gag and pol, differences between isolates are principally due to
point mutations, and only a small number of insertional or deletional
events is observed. Among these, we must note the presence in the
overlapping part of gag and pol of LAV BRU of an insertion of 12
amino acids (AA) which is encoded by the second copy of a 36 bp
direct repeat present only in this isolate and in HTLV-3. This
duplication was omitted because of a computing error in the published
sequence of LAV BRU (position 1712, Wain-Hobson et al., 1985) but was
indeed present in the HTLV-3 sequences [Ratner et al., 1985; Muesing
et al., 1985].

env: Three segments can be distinguished in the envelope glycoprotein
precursor [Allan et al., 1985; Montagnier et al., 1985;
DiMarzoVeronese et al., 1985]. The first is the signal peptide
(positions 1-33 in fig. 3), and its sequence appears as variable; the
second segment (pos. 34-530) forms the outer membrane protein
(hereinafter "OMP" or "gp110") and carries most of the genetic
variations, and in particular almost all of the numerous reciprocal
insertions and deletions; the third segment (531-877) is separated
from the OMP by a potential cleavage site following a constant basic
stretch (Arg-Glu-Lys-Arg) and forms the transmembrane protein
(hereinafter "TMP" or "gp41") responsible for the anchorage of the
envelope glycoprotein in the cellular membrane. A better conservation
of the TMP than the OMP has also been observed between the different
murine leukemia viruses (hereinafter "MLV") [Koch et al., 1983] and
could be due to structural constraints.

From the alignment of figure 3 and the graphical representation of
the envelope variability shown in figure 5, we clearly see the
existence of conserved domains, with little or no genetic variation,
and hypervariable domains, in which even the alignment of the
different sequences is very difficult, because of the existence of a
large number of mutations and of reciprocal insertions and deletions.
We have not included the sequence of the envelope of the HTLV-3
isolate since it is so close to that of LAV BRU (cf. fig. 4), even in
the hypervariable domains, that it did not add anything to the
analysis. While this graphical representation will be refined by more
sequence data, the general profile is already apparent, with three
hypervariable domains (Hy1, 2 and 3) all being located in the OMP and
separated by three well-conserved stretches (residues 37-130, 211-289
and 488-530 of fig. 3 alignment) probably associated with important
biological functions.

In spite of the extreme genetic variability, the folding pattern of
the envelope glycoprotein is probably constant. Indeed the position
of virtually all of the cysteine residues is conserved within the
different isolates (fig. 3 and 5), and the only three variable
cysteines fall either in the signal peptide or in the very C-terminal
part of the TMP. The hypervariable domains of the OMP are bounded by
conserved cysteines, suggesting that they may represent loops
attached to the common folding pattern. Also the calculated
hydropathic profiles [Kyte and Doolittle, 1982] of the different
envelope proteins are remarkably conserved (not shown).
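
A hydropathic profile of the kind cited above can be sketched in Python as a sliding-window average of the Kyte and Doolittle hydropathy values. The window length of 9 residues and the example peptide are assumptions made for illustration; the source does not state which window was used.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence.

    Assumption: the window length (9 by default) is illustrative only.
    Raises KeyError if the sequence contains a non-standard residue letter.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical peptide, for illustration only.
profile = hydropathy_profile("MRVKGIQRNCQHLWRWGTL", window=9)
print([round(v, 2) for v in profile])
```

Profiles computed this way for homologous proteins can then be compared position by position along an alignment.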

About half of the potential N-glycosylation sites, Asn-X-Ser/Thr,
found in the envelopes of the Zairian isolates map to the same
positions in LAV BRU (17/26 for LAV ELI and 17/28 for LAV MAL). The
other sites appear to fall within variable domains of env, suggesting
the existence of differences in the extent of envelope glycosylation
between different isolates.
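
The search for potential N-glycosylation sites of the form Asn-X-Ser/Thr, and the count of sites shared by two isolates, can be sketched in Python as follows. The function names and the toy sequence are hypothetical; the optional exclusion of proline at position X is a commonly applied refinement and is not stated in the source. Comparing positions between isolates assumes that both site lists use the numbering of a common alignment.

```python
def find_sequons(sequence, exclude_proline=False):
    """Return 1-based positions of the Asn in every Asn-X-Ser/Thr motif.

    Overlapping motifs are all reported. With exclude_proline=True,
    motifs whose middle residue X is proline are skipped (an assumption
    added for illustration, not part of the source definition).
    """
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 2] in ("S", "T"):
            if exclude_proline and seq[i + 1] == "P":
                continue
            sites.append(i + 1)
    return sites


def shared_sites(sites_a, sites_b):
    """Return the positions present in both lists (same coordinate system)."""
    return sorted(set(sites_a) & set(sites_b))


# Hypothetical sequence, for illustration only.
print(find_sequons("MNGTANPSNAT"))
print(find_sequons("MNGTANPSNAT", exclude_proline=True))
```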
Other viral proteins: Of the three other identified viral proteins,
the p27 encoded by orf F [Allan et al., 1985b] is the most variable
(fig. 4). The proteins encoded by orfs Q and S of the central region
are remarkable by their absence of insertions/deletions.
Surprisingly, a high frequency of amino acid substitutions,
comparable to that observed in env, is found for the product of orf S
(trans-activating factor). On the other hand, the protein encoded by
orf Q is no more variable than gag. Also noticeable is the variation
of the proteins encoded by the central regions of LAV ELI and
LAV MAL.

With the availability of the complete nucleotide sequence from five
independent isolates, some general features of the AIDS virus'
genetic variability are now emerging. Firstly, its principal cause is
point mutations which very often result in amino acid substitutions
and which are more frequent in the 3' part of the genome (orf S, env
and orf F). Like all RNA viruses, the retroviruses are thought to be
highly subject to mutations caused by errors of the RNA polymerases
during their replication, since there is no "proofreading" of this
step [Holland et al., 1982; Steinhauer and Holland, 1986].

Another source of genetic diversity is insertions/deletions. From the
figure 3 alignments, insertional events seem to be implicated in most
of the cases, since otherwise deletions should have occurred in
independent isolates at precisely the same locations. Furthermore,
upon analyzing these insertions, we have observed that they most
often represent one of the two copies of a direct repeat (fig. 6).
Some are perfectly conserved, like the 36 bp repeat in the gag-pol
overlap of LAV BRU (fig. 6-a); others carry point mutations resulting
in amino acid substitutions, and as a consequence, they are more
difficult to observe, though clearly present, in the hypervariable
domains of env (cf. fig. 6-g and -h). As noted for point mutations,
the env gene and orf F also appear as more susceptible to that form
of genetic variation than the rest of the genome. The degree of
conservation of these repeats must be related to their date of
occurrence in the analyzed sequences: the more degenerated, the more
ancient. A very recent divergence of LAV BRU and HTLV-3 is suggested
by the extremely low number of mismatched AA between their homologous
proteins. However, one of the LAV BRU repeats (located in the Hy1
domain of env, fig. 6-f) is not present in HTLV-3, indicating that
this generation of tandem repeats is a rapid source of genetic
diversity. We have found no traces of such a phenomenon, even when
comparing very closely related viruses, such as the Mason-Pfizer
monkey virus (hereinafter "MPMV") [Sonigo et al., 1986] and an
immunosuppressive simian virus (hereinafter "SRV-1") [Power et al.,
1986]. Insertion or deletion of one copy of a direct repeat has been
occasionally reported in mutant retroviruses [Shimotohno and Temin,
1981; Darlix, 1986], but the extent to which we observe this
phenomenon is unprecedented. The molecular basis of these
duplications is unclear, but could be the "copy-choice" phenomenon,
resulting from the diploidy of the retroviral genome [Varmus and
Swanstrom, 1984; Clark and Mak, 1983]. During the synthesis of the
first strand of the viral DNA, jumps are known to occur from one RNA
molecule to another, especially when a break or a stable secondary
structure is present on the template; an inaccurate re-initiation on
the other RNA template could result in the generation (or the
elimination) of a short direct repeat.
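
The notion of a tandem direct repeat, either perfectly conserved or degenerated by point mutations, can be illustrated with a short Python sketch that scans a nucleotide sequence for two adjacent copies of the same unit. This is a naive exhaustive scan written for clarity, not the procedure used to prepare figure 6; the function name, the minimum unit length and the toy sequences are assumptions.

```python
def find_tandem_repeats(sequence, min_unit=6, max_mismatches=0):
    """Return (start, unit_length, mismatches) for adjacent duplicated units.

    'start' is the 0-based index of the first copy. Two adjacent blocks
    of equal length are reported when they differ at no more than
    max_mismatches positions, so a degenerated repeat can be detected by
    raising max_mismatches. Overlapping hits are not merged.
    """
    seq = sequence.upper()
    n = len(seq)
    hits = []
    for unit in range(min_unit, n // 2 + 1):
        for start in range(0, n - 2 * unit + 1):
            first = seq[start:start + unit]
            second = seq[start + unit:start + 2 * unit]
            mismatches = sum(1 for x, y in zip(first, second) if x != y)
            if mismatches <= max_mismatches:
                hits.append((start, unit, mismatches))
    return hits


# Hypothetical sequences, for illustration only.
perfect = "GG" + "ACGTTC" * 2 + "TT"            # exact duplication
degenerated = "GG" + "ACGTTC" + "ACGATC" + "TT"  # one point mutation
print(find_tandem_repeats(perfect))
print(find_tandem_repeats(degenerated, max_mismatches=1))
```

Raising max_mismatches mirrors the observation that older repeats are more degenerated and therefore harder to recognize.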
Genetic variability and subsequent antigenic modifications have often
been developed by micro-organisms as a means for avoiding the host's
immune response, either by modifying their epitopes during the course
of the infection, as in trypanosomes [Borst and Cross, 1982], or by
generating a large repertoire of antigens, as observed in influenza
virus [Webster et al., 1982]. As the human AIDS virus is related to
animal lentiviruses [Sonigo et al., 1985; Chiu et al., 1985], its
genetic variability could be a source of antigenic variation, as can
be observed during the course of the infection by the ovine
lentivirus visna [Scott et al., 1979; Clements et al., 1980] or by
the equine infectious anemia virus (hereinafter "EIAV") [Montelaro et
al., 1984]. However, a major discrepancy with these animal models is
the extremely low, and possibly nonexistent, neutralizing activity of
the sera of individuals infected by the AIDS virus, whether they are
healthy carriers, displaying minor symptoms, or afflicted with AIDS
[Weiss et al., 1985; Clavel et al., 1985]. Furthermore, even for the
visna virus the exact role of antigenic variation in the pathogenesis
is unclear [Thormar et al., 1983; Lutley et al., 1983]. We rather
believe that genetic variation represents a general selective
advantage for lentiviruses by allowing an adaptation to different
environments, for example by modifying their tissue or host tropisms.
In the particular case of the AIDS virus, rapid genetic variations
are tolerated, especially in the envelope. This could allow the virus
to become adapted to different "micro-environments" of the membrane
of its principal target cells, namely the T4 lymphocytes. These
"micro-environments" could result from the immediate vicinity of the
virus receptor to polymorphic surface proteins, differing either
between individuals or between clones of lymphocytes.
Since the proteins of most of the isolates are antigenically
cross-reactive, the genotypic differences do not seem to affect the
sensitivity of actual diagnostic tests, based upon the detection of
antibodies to the AIDS virus and using purified virions as antigens.
They nevertheless have to be considered for the development of the
"second-generation" tests, which are expected to be more specific and
will use smaller synthetic or genetically-engineered viral antigens.
The identification of conserved domains in the highly immunogenic
envelope glycoprotein and the core structural proteins (gag) is very
important for these tests. The conserved stretch found at the end of
the OMP and the beginning of the TMP (490-620, fig. 3) could be a
good candidate, since a bacterial fusion protein containing this
domain was well-detected by AIDS patients' sera [Chang et al., 1985].

The envelope, specifically the OMP, mediates the interaction between
a retrovirus and its specific cellular receptor [DeLarco and Todaro,
1976; Robinson et al., 1980]. In the case of the AIDS virus, in vitro
binding assays have shown the interaction of the envelope
glycoprotein gp110 with the T4 cellular surface antigen [McDougal et
al., 1986], already thought to be closely associated with the virus
receptor [Klatzmann et al., 1984; Dalgleish et al., 1984].
Identification of the AIDS virus envelope domains that are
responsible for this interaction (receptor-binding domains) appears
to be fundamental for understanding of the host-viral interactions
and for designing a protective vaccine, since an immune response
against these epitopes could possibly elicit neutralizing antibodies.
As the AIDS virus receptor is at least partly formed of a constant
structure, the T4 antigen, the binding site of the envelope is
unlikely to be exclusively encoded by domains undergoing drastic
genetic changes between isolates, even if these could be implicated
in some kind of an "adaptation". One or several of the conserved
domains of the OMP (residues 37-130, 211-289 and 488-530 of fig. 3
alignment), brought together by the folding of the protein, must play
a part in the virus-receptor interaction, and this can be explored
with synthetic or genetically-engineered peptides derived from these
domains, either by direct binding assays or indirectly by assaying
the neutralizing activity of specific antibodies raised against them.



Zaire and the neighbouring countries of Central Africa are considered
as an area endemic with the AIDS virus infection, and the possibility
that the virus has emerged in Africa has become a subject of intense
controversy (see Norman, 1985). From the present study, it is clear
that the genetic organization of Zairian isolates is the same as that
of American isolates, thereby indicating a common origin. The very
important sequence differences observed between the proteins are
consistent with a divergent evolutionary process. In addition, the
two African isolates are mutually more divergent than the American
isolates already analyzed; as far as that observation can be
extrapolated, it suggests a longer evolution of the virus in Africa
and is also consistent with the fact that a larger fraction of the
population is exposed than in developed countries.

A novel human retrovirus with morphology and biological properties
(cytopathogenicity, T4 tropism) similar to those of LAV, but
nevertheless clearly genetically and antigenically distinct from it,
was recently isolated from two patients with AIDS originating from
Guinea Bissau, West Africa [Clavel et al., 1986]. In neighboring
Senegal, the population was seemingly exposed to a retrovirus also
distinct from LAV but apparently non-pathogenic [Barin et al., 1985;
Kanki et al., 1986]. Both of these novel African retroviruses seem to
be antigenically related to the simian T-cell lymphotropic virus
(hereinafter "STLV-III") shown to be widely present in healthy
African green monkeys and other simian species [Kanki et al., 1985].
This raises the possibility of a large group of African primate
lentiviruses, ranging from the apparently non-pathogenic simian
viruses to the LAV-type viruses. Their precise relationship will only
be known after their complete genetic characterization, but it is
already very likely that they have evolved from a common progenitor.
The important genetic variability we have observed between isolates
of the AIDS virus in Central Africa is probably a hallmark of this
entire group and may account for the apparently important genetic
divergence between its members (loss of cross-antigenicity in the
envelopes). In this sense, the conservation of the tropism for the T4
lymphocytes suggests that it is a major advantage acquired by these
retroviruses.

EXPERIMENTAL PROCEDURES
Virus isolation

LAV MAL was isolated from the peripheral blood
lymphocytes of the patient as described [Barre-Sinoussi
et al., 1983]. Briefly, the lymphocytes were fractionated
and co-cultivated with phytohaemagglutinin-stimulated
normal human lymphocytes in the presence of
interleukin 2 and anti-alpha interferon serum. Viral
production was assessed by cell-free reverse
transcriptase (hereinafter "RT") activity assay in the
cultures and by electron microscopy.
Molecular cloning

Normal donor lymphocytes were acutely infected
(10^4 cpm of RT activity/10^6 cells), as described [Barre-
Sinoussi et al., 1983], and total DNA was extracted at
the beginning of the RT activity peak. A lambda library
using the L47-1 vector [Loenen and Brammar, 1982] was
constructed by partial HindIII digestion of the DNA as
already described [Alizon et al., 1984]. DNA from
infected cells was digested to completion with HindIII
and the 9-10 kb fraction was selected on 0.8 % low
melting point agarose gel and ligated into L47-1 HindIII
arms. About 2 x 10^5 plaques for LAV MAL, obtained by in
vitro packaging (Amersham), were plated on E. coli LA101
and screened in situ under stringent conditions, using
the 9 kb SacI insert of the clone lambda J19 [Alizon et
al., 1984] carrying most of the LAV BRU genome as probe.
Clones displaying positive signals were plaque-purified
and propagated on E. coli C600 recBC, and the recombinant
phage M-H11 carrying the complete genetic information of
LAV MAL was further characterized by restriction mapping.
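
The complete HindIII digestion and 9-10 kb size selection described
above can be illustrated with a minimal sketch. It assumes a linear DNA
given as a plain string and the usual HindIII recognition sequence
AAGCTT, cut after the first A; the function names and the toy sequence
are illustrative only and are not part of the original procedure.

```python
import re

# HindIII recognition sequence; the enzyme cuts after the first A (A^AGCTT).
HINDIII_SITE = "AAGCTT"
HINDIII_CUT_OFFSET = 1


def digest(seq, site=HINDIII_SITE, cut_offset=HINDIII_CUT_OFFSET):
    """Return fragment lengths from a complete digestion of a linear DNA."""
    seq = seq.upper()
    # A lookahead lets overlapping sites be found; the site is palindromic,
    # so scanning one strand is enough.
    cuts = [m.start() + cut_offset for m in re.finditer("(?=" + site + ")", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def size_select(fragment_lengths, low=9000, high=10000):
    """Keep fragments whose length falls in the selected window (in bp)."""
    return [n for n in fragment_lengths if low <= n <= high]


if __name__ == "__main__":
    toy = "GG" + "AAGCTT" + "C" * 10 + "AAGCTT" + "TT"  # hypothetical sequence
    print(digest(toy))
    print(size_select([500, 9500, 12000, 9000]))
```

A partial digestion, as used for the lambda library, would cut only a
subset of the sites and is not modelled here.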

Nucleotide sequence strategy 

Viral fragments derived from M-H11 were
sequenced by the dideoxy chain terminator procedure
[Sanger et al., 1977] after "shotgun" cloning in the
M13mp8 vector [Messing and Vieira, 1982] as previously
described [Sonigo et al., 1985]. The viral genome of
LAV MAL is 9229 nucleotides long as shown in figs. 7A-7I.
Each nucleotide of LAV MAL was determined from more than
5 independent clones on average.

SIGNIFICANCE OF THE FIGURES
Figure 1 contains an analysis of AIDS virus isolates,
showing:

A/ Restriction maps of the inserts of phage
lambda clones derived from cells infected with LAV ELI
(hereinafter "E-H12") and with LAV MAL (M-H11). A
schematic genetic organization of the AIDS virus has
been drawn above the maps. The LTRs are indicated by
solid boxes. Restriction sites are indicated as follows:
A:AvaI; B:BamHI; Bg:BglII; E:EcoRI; H:HindIII;
Hc:HincII; K:KpnI; N:NdeI; P:PstI; S:SacI; and X:XbaI.
Asterisks indicate the HindIII cloning sites in lambda
L47-1 vector.

B/ A comparison of the sites for seven
restriction enzymes in six isolates : the prototype AIDS
virus LAV BRU, LAV MAL and LAV ELI ; and Z1, Z2 and Z3.
Restriction sites are represented by the following
symbols vertically aligned with the symbols in fig. 1A:
BglII; * :EcoRI; V :HincII; y :HindIII; KpnI; O :NdeI;
and o :SacI.

Figure 2 shows the genetic organization of the central
region in AIDS virus isolates. Stop codons in each phase
are represented as vertical bars. Vertical arrows
indicate possible AUG initiation codons. Splice acceptor (A)
and donor (D) sites identified in subgenomic viral mRNA
[Muesing et al., 1985] are shown below the graphic of
LAV BRU, and corresponding sites in LAV ELI and LAV MAL are
indicated. PPT indicates the repeat of the polypurine
tract flanking the 3' LTR. As observed in LAV BRU
[Wain-Hobson et al., 1985], the PPT is repeated 256
nucleotides 5' to the end of the pol gene in both the
LAV ELI and LAV MAL sequences, but this repeat is
degenerated at two positions in LAV .

Figure 3 shows an alignment of the protein sequences of
four AIDS virus isolates. Isolate LAV BRU [Wain-Hobson et
al., 1985] is taken as reference ; only differences with
LAV BRU are noted for ARV 2 [Sanchez-Pescador et al.,
1985] and the two Zairian isolates LAV MAL and LAV ELI. A
minimal number of gaps (-) were introduced in the
alignments. The NH2-termini of p25 gag and p18 gag are
indicated [Sanchez-Pescador, 1985]. The potential
cleavage sites in the envelope precursor [Allan et al.,
1985a ; diMarzo Veronese, 1985] separating the signal
peptide (hereinafter "SP"), OMP and TMP are indicated as
vertical arrows ; conserved cysteines are indicated by
black circles and variable cysteines are boxed. The one
letter code for each amino acid is as follows: A:Ala ;
C:Cys ; D:Asp ; E:Glu ; F:Phe ; G:Gly ; H:His ; I:Ile ;
K:Lys ; L:Leu ; M:Met ; N:Asn ; P:Pro ; Q:Gln ; R:Arg ;
S:Ser ; T:Thr ; V:Val ; W:Trp ; Y:Tyr.

Figure 4 shows a quantitation of the sequence divergence 
between homologous proteins of different isolates. Part 
A of each table gives results deduced from two-by-two 
alignments using the proteins of LAV BRU as reference,
part B, those of LAV^ as reference. Sources: Muesing 
et al., 1985 for HTLV-3 ; Sanchez-Pescador et al., 1985
for ARV 2 and Wain-Hobson et al., 1985 for LAV BRU. For
each case in the tables, the size in amino acids of the
protein (calculated from the first methionine residue or
from the beginning of the orf for pol) is given at the
upper left part. Below are given the number of deletions
(left) and insertions (right) necessary for the
alignment. The large numbers in bold face represent the
percentage of amino acid substitutions (insertions/
deletions being excluded). Two by two alignments were done
with computer assistance [Wilbur and Lipman, 1983],
using a gap penalty of 1, K-tuple of 1, and window of
20, except for the hypervariable domains of env, where
the number of gaps was made minimum, and which are
essentially aligned as shown in fig. 3. The sequence of
the predicted protein encoded by orf R of HTLV-3 has not
been compared because of a premature termination
relative to all other isolates.
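
The quantities reported in each cell of Figure 4 can be sketched from a
pair of already aligned sequences. This is a minimal illustration, not
the Wilbur and Lipman alignment procedure itself: it assumes the two
sequences are given with "-" as the gap character, and it counts
insertions and deletions per gapped position rather than per gap event,
which is an assumption of this sketch.

```python
def compare_aligned(ref, other, gap="-"):
    """Count deletions, insertions and substitutions of `other` versus `ref`.

    Both strings must be rows of the same pairwise alignment. The percentage
    of substitutions is computed over positions where both rows carry a
    residue, so insertions and deletions are excluded.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have the same length")
    deletions = insertions = substitutions = compared = 0
    for r, o in zip(ref, other):
        if r == gap and o == gap:
            continue  # column empty in both rows, nothing to compare
        if o == gap:
            deletions += 1   # residue present in the reference only
        elif r == gap:
            insertions += 1  # residue present in the other isolate only
        else:
            compared += 1
            if r != o:
                substitutions += 1
    percent = 100.0 * substitutions / compared if compared else 0.0
    return {
        "deletions": deletions,
        "insertions": insertions,
        "substitutions": substitutions,
        "percent_substitutions": percent,
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from Figure 3.
    print(compare_aligned("MKT-LV", "MRTAL-"))
```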

Figure 5 shows the variability of the AIDS virus
envelope protein. For each position x of the alignment
of env (Fig. 3), variability V(x) was calculated as:
V(x) = number of different amino acids at position x /
frequency of the most abundant amino acid at position x.
Gaps in the alignments are considered as another amino
acid. For an alignment of 4 proteins, V(x) ranges from 1
(identical AA in the 4 sequences) to 16 (4 different
AA). This type of representation has previously been
used in a compilation of the AA sequence of
immunoglobulin variable regions [Wu and Kabat, 1970]. Vertical
arrows indicate the cleavage sites ; asterisks represent
potential N-glycosylation sites (N-X-S/T) conserved in
all four isolates ; black triangles represent
conserved cysteine residues. Black lozenges mark the
three major hydrophobic domains: OMP, TMP and SP; and
the hypervariable domains: Hy1, 2 and 3.
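
The variability index defined for Figure 5 can be written out as a
short sketch. It assumes the aligned sequences are supplied as
equal-length strings with "-" for gaps (counted as one more amino acid,
as stated above) and that "frequency" means the fraction of sequences
carrying the most abundant residue; the function names are
illustrative.

```python
from collections import Counter


def variability(column):
    """V(x) for one alignment column, in the manner of Wu and Kabat."""
    counts = Counter(column)
    # Fraction of sequences that carry the most abundant residue.
    most_common_freq = counts.most_common(1)[0][1] / len(column)
    return len(counts) / most_common_freq


def variability_profile(aligned):
    """Return V(x) for every position of a set of aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have the same length")
    return [variability(col) for col in zip(*aligned)]


if __name__ == "__main__":
    # Hypothetical four-sequence toy alignment, not taken from Figure 3.
    print(variability_profile(["AK", "AR", "AK", "A-"]))
```

With four sequences the index is 1 for an invariant column and 16 for a
column with four different residues, matching the range given in the
legend.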

Figure 6 shows the direct repeats in the proteins of
different AIDS virus isolates. These examples are
derived from the aligned sequences of gag (a, b), F
(c, d) and env (e, f, g, h) shown in figure 3. The two
elements of the direct repeat are boxed, while
degenerated positions are underlined.

Figures 7A-7I show the complete cDNA sequence of LAV MAL
of this invention.

The invention thus pertains more specifically
to the proteins, glycoproteins and other polypeptides
including the polypeptidic structures shown in the
figures 1-7. The first and last amino acid residues of
these proteins, glycoproteins and polypeptides carry
numbers computed from a first amino acid of the
open-reading frames concerned, although these numbers do
not correspond exactly to those of the LAV MAL proteins
concerned, rather to the corresponding proteins of the
LAV BRU or sequences shown in figs. 3A, 3B and 3C. Thus a
number corresponding to a "first amino acid residue" of
a LAV MAL protein corresponds to the number of the first
amino-acyl residue of the corresponding LAV BRU protein
which, in any of figs. 3A, 3B or 3C, is in direct
alignment with the corresponding first amino acid of the
LAV MAL protein. Thus the sequences concerned can be read
from figs. 7A-7I to the extent where they do not appear
with sufficient clarity from Figs. 3A-3F.

The preferred protein sequences of this 
invention extend between the corresponding "first" and
"last" amino acid residues. Also preferred are the
protein(s)- or glycoprotein(s)-portions including part
of the sequences which follow :

OMP or gp110 proteins, including precursors :
1 to 530

OMP or gp110 without precursor :
34-530

Sequence carrying the TMP or gp41 protein :
531-877, particularly
680-700

well conserved stretches of OMP :

37-130, 
211-289 and 
488-530 

well conserved stretch found at the end of the OMP and 
the beginning of TMP : 

490-620. 
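
Because the residue numbers above follow the reference numbering of the
alignment rather than the isolate's own numbering, reading a stretch
out of an aligned sequence can be sketched as follows. The sketch
assumes two rows of one alignment given as strings with "-" for gaps;
the range table copies the numbers listed above, while the function
names and the toy sequences are hypothetical.

```python
# Residue ranges as listed above (reference numbering, inclusive).
STRETCHES = {
    "OMP with precursor": [(1, 530)],
    "OMP without precursor": [(34, 530)],
    "TMP": [(531, 877)],
    "OMP well conserved": [(37, 130), (211, 289), (488, 530)],
    "OMP/TMP junction": [(490, 620)],
}


def reference_columns(ref_aligned, gap="-"):
    """Map each 1-based reference residue number to its alignment column."""
    cols = {}
    n = 0
    for col, ch in enumerate(ref_aligned):
        if ch != gap:
            n += 1
            cols[n] = col
    return cols


def extract_stretch(ref_aligned, target_aligned, first, last, gap="-"):
    """Return the target residues aligned with reference residues first..last."""
    if len(ref_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have the same length")
    cols = reference_columns(ref_aligned, gap)
    if first > last or first not in cols or last not in cols:
        raise ValueError("range lies outside the reference sequence")
    segment = target_aligned[cols[first]:cols[last] + 1]
    return segment.replace(gap, "")


if __name__ == "__main__":
    ref = "MKV-TGL"     # hypothetical reference row
    target = "MRVAT-L"  # hypothetical isolate row
    print(extract_stretch(ref, target, 2, 4))
```

Residues inserted in the isolate between the first and last reference
positions are kept, so the returned stretch may be longer or shorter
than the reference range.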

Proteins containing or consisting of the "well
conserved stretches" are of particular interest for the
production of immunogenic compositions and (preferably
in relation to the stretches of the env protein) of
vaccine compositions against the LAV-1 viruses.

The invention concerns more particularly all
the DNA fragments which have been more specifically
referred to in the drawings and which correspond to open
reading frames. It will be understood that one skilled
in the art will be able to obtain them all, for instance
by cleaving an entire DNA corresponding to the complete
genome of LAV MAL, such as by cleavage by a partial or
complete digestion thereof with a suitable restriction
enzyme and by the subsequent recovery of the relevant
fragments. The DNA disclosed above can be resorted to
also as a source of suitable fragments. The techniques
disclosed in PCT application for the isolation of the
fragments which can then be included in suitable
plasmids are applicable here too. Of course, other
methods can be used, some of which have been exemplified
in European Application No. 178,978, filed September 17,
1985. Reference is for instance made to the following
methods :

a) DNA can be transfected into mammalian cells
with appropriate selection markers by a variety of
techniques, such as calcium phosphate precipitation,
polyethylene glycol, protoplast-fusion, etc.

b) DNA fragments corresponding to genes can be
cloned into expression vectors for E. coli, yeast or
mammalian cells and the resultant proteins purified.

c) The proviral DNA can be "shot-gunned"
(fragmented) into procaryotic expression vectors to
generate fusion polypeptides. Recombinants producing
antigenically competent fusion proteins can be
identified by simply screening the recombinants with
antibodies against LAV antigens.

The invention further refers to DNA
recombinants, particularly modified vectors, including any of
the preceding DNA sequences adapted to transform
corresponding microorganisms or cells, particularly eucaryotic
cells such as yeasts, for instance Saccharomyces
cerevisiae, or higher eucaryotic cells, particularly
cells of mammals, and to permit expression of said DNA
sequences in the corresponding microorganisms or cells.
General methods of that type have been recalled in the
abovesaid PCT international patent application PCT/EP
85/00548, filed October 18, 1985.

More particularly the invention relates to
such DNA recombinant vectors modified by the
abovesaid DNA sequences and which are capable of
transforming higher eucaryotic cells, particularly mammalian
cells. Preferably, any of the abovesaid sequences are
placed under the direct control of a promoter contained
in said vectors and recognized by the polymerases of
said cells, such that the first nucleotide codons
expressed correspond to the first triplets of the
above-defined DNA sequences. Accordingly, this invention
also relates to the corresponding DNA fragments which
can be obtained from the genome of LAV MAL or its cDNA by
any appropriate method. For instance, such a method
comprises cleaving said LAV MAL genome or its cDNA by
restriction enzymes, preferably at the level of
restriction sites surrounding said fragments and close to the
opposite extremities respectively thereof, recovering
and identifying the fragments sought according to sizes,
if need be checking their restriction maps or nucleotide
sequences (or by reaction with monoclonal antibodies
specifically directed against epitopes carried by the
polypeptides encoded by said DNA fragments), and further,
if need be, trimming the extremities of the fragment,
for instance by an exonucleolytic enzyme such as Bal31,
for the purpose of controlling the desired nucleotide
sequences of the extremities of said DNA fragments or,
conversely, repairing said extremities with Klenow
enzyme and possibly ligating the latter to synthetic
polynucleotide fragments designed to permit the
reconstitution of the nucleotide extremities of said
fragments. Those fragments may then be inserted in any
of said vectors for causing the expression of the
corresponding polypeptide by the cell transformed
therewith. The corresponding polypeptide can then be
recovered from the transformed cells, if need be after lysis
thereof, and purified by methods such as
electrophoresis. Needless to say, all conventional methods for
performing these operations can be resorted to.

The invention also relates more specifically
to cloned probes which can be made starting from any DNA
fragment according to this invention, thus to
recombinant DNAs containing such fragments, particularly any
plasmids amplifiable in procaryotic or eucaryotic cells
and carrying said fragments. Using the cloned DNA
fragments as a molecular hybridization probe, either by
labelling with radionucleotides or with fluorescent
reagents, LAV virion RNA may be detected directly in
the blood, body fluids and blood products (e.g.,
antihemophilic factors such as Factor VIII concentrates)
and vaccines (e.g., hepatitis B vaccine). It has already
been shown that whole virus can be detected in culture
supernatants of LAV producing cells. A suitable method
for achieving that detection comprises immobilizing
virus on a support (e.g., a nitrocellulose filter),
disrupting the virion and hybridizing with labelled
(radiolabelled or "cold" fluorescent- or enzyme-labelled)
probes. Such an approach has already been developed
for Hepatitis B virus in peripheral blood [SCOTTO J. et
al. Hepatology (1983), 3, 379-384].

Probes according to the invention can also be
used for rapid screening of genomic DNA derived from the
tissue of patients with LAV related symptoms, to see if
the proviral DNA or RNA present in host tissue and other
tissues can be related to that of LAV MAL.

A method which can be used for such screening
comprises the following steps : extraction of DNA from
tissue, restriction enzyme cleavage of said DNA,
electrophoresis of the fragments and Southern blotting
of genomic DNA from tissues, and subsequent hybridization
with labelled cloned LAV proviral DNA. Hybridization in
situ can also be used.

Lymphatic fluids and tissues and other
non-lymphatic tissues of humans, primates and other
mammalian species can also be screened to see if other
evolutionarily related retroviruses exist. The methods
referred to hereinabove can be used, although
hybridization and washings would be done under non-stringent
conditions.

The DNAs or DNA fragments according to the 
invention can be used also for achieving the expression 
of viral antigens of LAV MAL for diagnostic purposes.

The invention relates generally to the poly- 
peptides themselves, whether synthesized chemically, 
isolated from viral preparations or expressed by the 
different DNAs of the invention, particularly by the 
ORFs or fragments thereof in appropriate hosts,
particularly procaryotic or eucaryotic hosts, after
transformation thereof with a suitable vector previously
modified by the corresponding DNAs. 

More generally, the invention also relates to
any of the polypeptide fragments (or molecules,
particularly glycoproteins having the same polypeptide
backbone as the polypeptides mentioned hereinabove) bearing
an epitope characteristic of a protein or glycoprotein
of LAV MAL, which polypeptide or molecule then has
N-terminal and C-terminal extremities respectively
either free or, independently from each other,
covalently bonded to amino acids other than those which are
normally associated with them in the larger polypeptides
or glycoproteins of the LAV virus, which last mentioned
amino acids are then free or belong to another
polypeptide sequence. Particularly, the invention relates to
hybrid polypeptides containing any of the
epitope-bearing-polypeptides which have been defined more
specifically hereinabove, recombined with other polypeptide
fragments normally foreign to the LAV proteins, having
sizes sufficient to provide for an increased
immunogenicity of the epitope-bearing-polypeptide yet, said
foreign polypeptide fragments either being
immunogenically inert or not interfering with the immunogenic
properties of the epitope-bearing-polypeptide.

Such hybrid polypeptides, which may contain
from 5 up to 150, even 250 amino acids, usually consist
of the expression products of a vector which contained
ab initio a nucleic acid sequence expressible under the
control of a suitable promoter or replicon in a suitable
host, which nucleic acid sequence had however beforehand
been modified by insertion therein of a DNA sequence
encoding said epitope-bearing-polypeptide.

Said epitope-bearing-polypeptides, particularly
those whose N-terminal and C-terminal amino acids are
free, are also accessible by chemical synthesis
according to techniques well known in the chemistry of proteins.

The synthesis of peptides in homogeneous
solution and in solid phase is well known. In this
respect, recourse may be had to the method of synthesis
in homogeneous solution described by Houben-Weyl in the
work entitled "Methoden der Organischen Chemie" (Methods
of Organic Chemistry) edited by E. WUNSCH, vol. 15-I
and II, THIEME, Stuttgart 1974. This method of synthesis
consists of successively condensing either the
successive amino acids in twos, in the appropriate order,
or successive peptide fragments previously available or
formed and containing already several amino-acyl
residues in the appropriate order respectively. Except
for the carboxyl and amino groups which will be engaged
in the formation of the peptide bonds, care must be
taken to protect beforehand all other reactive groups
borne by these amino-acyl groups or fragments. However,
prior to the formation of the peptide bonds, the
carboxyl groups are advantageously activated, according
to methods well known in the synthesis of peptides.
Alternatively, recourse may be had to coupling reactions
bringing into play conventional coupling reagents, for
instance of the carbodiimide type, such as
1-ethyl-3-(3-dimethyl-amino-propyl)-carbodiimide. When
the amino acid group used carries an additional amine
group (e.g., lysine) or another acid function (e.g.,
glutamic acid), these groups may be protected by
carbobenzoxy or t-butyloxycarbonyl groups, as regards
the amine groups, or by t-butylester groups, as regards
the carboxylic groups. Similar procedures are available
for the protection of other reactive groups. For
example, an -SH group (e.g., in cysteine) can be
protected by an acetamidomethyl or paramethoxybenzyl
group.



In the case of a progressive synthesis, amino
acid by amino acid, the synthesis starts preferably with
the condensation of the C-terminal amino acid with the
amino acid which corresponds to the neighboring
amino-acyl group in the desired sequence and so on, step by
step, up to the N-terminal amino acid. Another preferred
technique which can be used is that described by R.B.
Merrifield in "Solid Phase Peptide Synthesis" [J. Am.
Chem. Soc., 85, 2149-2154]. In accordance with the
Merrifield process, the first C-terminal amino acid of
the chain is fixed to a suitable porous polymeric resin
by means of its carboxylic group, the amino group of the
amino acid then being protected, for example by a
t-butyloxycarbonyl group. When the first C-terminal
amino acid is thus fixed to the resin, the protective
group of the amine group is removed by washing the resin
with an acid, i.e., trifluoroacetic acid, when the
protective group of the amine group is a
t-butyloxycarbonyl group. Then, the carboxylic group of the second
amino acid, which is to provide the second amino-acyl
group of the desired peptidic sequence, is coupled to
the deprotected amine group of the C-terminal amino acid
fixed to the resin. Preferably, the carboxyl group of
this second amino acid has been activated, for example
by dicyclohexyl-carbodiimide, while its amine group has
been protected, for example by a t-butyloxycarbonyl
group. The first part of the desired peptide chain,
which comprises the first two amino acids, is thus
obtained. As previously, the amine group is then
deprotected, and one can further proceed with the fixing
of the next amino-acyl group and so forth until the
whole peptide sought is obtained. The protective groups
of the different side groups, if any, of the peptide
chain so formed can then be removed. The peptide sought
can then be detached from the resin, for example by
means of hydrofluoric acid, and finally recovered in
pure form from the acid solution according to
conventional procedures.

As regards the peptide sequences of smallest
size bearing an epitope or immunogenic determinant, and
more particularly those which are readily accessible by
chemical synthesis, it may be required, in order to
increase their in vivo immunogenic character, to couple
or "conjugate" them covalently to a physiologically
acceptable and non-toxic carrier molecule. By way of
examples of carrier molecules or macromolecular supports
which can be used for making the conjugates according to
the invention can be mentioned natural proteins, such as
tetanus toxoid, ovalbumin, serum-albumins, hemocyanins,
etc. Synthetic macromolecular carriers, for example
polylysines or poly(D-L-alanine)-poly(L-lysine)s, can be
used too. Other types of macromolecular carriers that
can be used, which generally have molecular weights
higher than 20,000, are known from the literature. The
conjugates can be synthesized by known processes such as
are described by Frantz and Robertson in "Infect. and
Immunity", 33, 193-198 (1981) and by P.E. Kauffman in
"Applied and Environmental Microbiology", October 1981,
Vol. 42, No. 4, pp. 611-614. For instance, the following
coupling agents can be used : glutaric aldehyde,
chloroformate, water-soluble carbodiimides such
as (N-ethyl-N'-(3-dimethylamino-propyl) carbodiimide,
HCl), diisocyanates, bis-diazobenzidine, di- and
trichloro-s-triazines, cyanogen bromides,
benzaquinone, as well as the coupling agents mentioned
in "Scand. J. Immunol.", 1978, vol. 8, pp. 7-23
(Avrameas, Ternynck, Guesdon).

Any coupling process can be used for bonding
one or several reactive groups of the peptide, on the
one hand, and one or several reactive groups of the
carrier, on the other hand. Again, coupling is
advantageously achieved between carboxyl and amine groups
carried by the peptide and the carrier or vice-versa in
the presence of a coupling agent of the type used in
protein synthesis, e.g.,
1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide,
N-hydroxybenzotriazole, etc. Coupling
between amine groups respectively borne by the peptide
and the carrier can also be made with glutaraldehyde,
for instance, according to the method described by
BOQUET, P. et al. (1982) Molec. Immunol., 13, 1441-1549,
when the carrier is hemocyanin.

The immunogenicity of epitope-bearing-peptides
can also be reinforced by oligomerisation thereof, for
example in the presence of glutaraldehyde or any other
suitable coupling agent. In particular, the invention
relates to the water soluble immunogenic oligomers thus
obtained, comprising particularly from 2 to 10 monomer
units.

The glycoproteins, proteins and other
polypeptides (generally designated hereinafter as "antigens" of
this invention), whether obtained by methods such as are
disclosed in the earlier patent applications referred to
above, in a purified state from LAV virus
preparations or, as concerns more particularly the peptides,
by chemical synthesis, are useful in processes for the
detection of the presence of anti-LAV antibodies in
biological media, particularly biological fluids such as
sera from man or animal, particularly with a view to
possibly diagnosing LAS or AIDS.

Particularly the invention relates to an in
vitro process of diagnosis making use of an envelope
glycoprotein or of a polypeptide bearing an epitope of
this glycoprotein of LAV MAL for the detection of
anti-LAV antibodies in the serums of persons who carry them.
Other polypeptides, particularly those carrying an
epitope of a core protein, can be used too.

A preferred embodiment of the process of the 
invention comprises : 

- depositing a predetermined amount of one or several of 
said antigens in the cups of a titration microplate ; 

- introducing increasing dilutions of the biological
fluid to be diagnosed (e.g., blood serum, spinal fluid,
lymphatic fluid, and cephalo-rachidian fluid), into
these cups ;

- incubating the microplate ; 

- washing the microplate carefully with an appropriate
buffer ;

- adding into the cups specific labelled antibodies
directed against blood immunoglobulins and

- detecting the antigen-antibody-complex formed, which 
is then indicative of the presence of LAV antibodies in 
the biological fluid. 

Advantageously the labelling of the
anti-immunoglobulin antibodies is achieved by an enzyme selected
from among those which are capable of hydrolysing a
substrate, which substrate undergoes a modification of
its radiation-absorption, at least within a
predetermined band of wavelengths. The detection of the
substrate, preferably comparatively with respect to a
control, then provides a measurement of the potential
risks, or of the effective presence, of the disease.

Thus, preferred methods of immuno-enzymatic
and also immunofluorescent detections, in particular
according to the ELISA technique, are provided.
Titrations may be determinations by immunofluorescence
or direct or indirect immuno-enzymatic determinations.
Quantitative titrations of antibodies on the serums
studied can be made.
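
A quantitative titration of this kind is commonly summarised as an
endpoint titre. The sketch below is an illustration only: the cut-off
rule (twice the mean absorbance of negative control wells), the
function names and the numerical readings are assumptions of this
example and are not specified by the process described above.

```python
def cutoff_from_controls(negative_controls, factor=2.0):
    """Assumed cut-off rule: `factor` times the mean negative-control reading."""
    return factor * sum(negative_controls) / len(negative_controls)


def endpoint_titer(dilutions, absorbances, cutoff):
    """Return the highest dilution factor whose absorbance exceeds the cut-off.

    `dilutions` holds reciprocal dilution factors (100 for a 1:100 dilution)
    and `absorbances` the matching readings. Returns None when no well is
    positive.
    """
    positive = [d for d, a in zip(dilutions, absorbances) if a > cutoff]
    return max(positive) if positive else None


if __name__ == "__main__":
    # Hypothetical plate readings for one serum.
    cutoff = cutoff_from_controls([0.05, 0.07, 0.06])
    dilutions = [100, 200, 400, 800, 1600]
    readings = [1.20, 0.80, 0.40, 0.15, 0.08]
    print(endpoint_titer(dilutions, readings, cutoff))
```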

The invention also relates to the diagnostic
kits themselves for the in vitro detection of antibodies
against the LAV virus, which kits comprise any of the
polypeptides identified herein and all the biological
and chemical reagents, as well as equipment, necessary
for performing diagnostic assays. Preferred kits comprise
all reagents required for carrying out ELISA assays.
Thus preferred kits will include, in addition to any of
said polypeptides, suitable buffers and anti-human
immunoglobulins, which anti-human immunoglobulins are
labelled either by an immunofluorescent molecule or by an
enzyme. In the last instance, preferred kits also
comprise a substrate hydrolysable by the enzyme and
providing a signal, particularly modified absorption of a
radiation, at least in a determined wavelength, which
signal is then indicative of the presence of antibody in
the biological fluid to be assayed with said kit.

It can of course be of advantage to use
several proteins or polypeptides not only of LAV MAL, but
also of LAV ELI together with homologous proteins or
polypeptides of earlier described viruses, such as LAV BRU,
HTLV-3, ARV 2, etc.

The invention also relates to vaccine
compositions whose active principle is to be constituted by any
of the antigens, i.e., the hereinabove disclosed
polypeptides of LAV MAL, particularly the purified gp110 or
immunogenic fragments thereof, fusion polypeptides or
oligopeptides in association with a suitable
pharmaceutically or physiologically acceptable carrier. A first
type of preferred active principle is the gp110
immunogen of said immunogens. Other preferred active
principles to be considered in that field consist of
the peptides containing less than 250 amino acid units,
preferably less than 150, particularly from 5 to 150
amino acid residues, as deducible from the complete
genome of LAV MAL and even more preferably those peptides
which contain one or more groups selected from Asn-X-Thr
and Asn-X-Ser as defined above. Preferred peptides for
use in the production of vaccinating principles are
peptides (a) to (f) as defined above. By way of example,
there may be mentioned that suitable dosages of the
vaccine compositions are those which are effective to
elicit antibodies in vivo in the host, particularly a
human host. Suitable doses range from 10 to 500
micrograms of polypeptide, protein or glycoprotein per
kg, for instance 50 to 100 micrograms per kg.
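
Peptides containing Asn-X-Thr or Asn-X-Ser groups can be located in a
protein sequence with a short search. The following is a minimal
sketch assuming a one-letter amino acid string; the function name and
the toy sequence are hypothetical, and the optional exclusion of
proline at position X is an added assumption of this example, not a
condition stated in the text.

```python
import re


def find_sequons(protein, exclude_proline=False):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs.

    With exclude_proline=True, motifs whose middle residue X is proline
    are skipped (an optional refinement, off by default).
    """
    middle = "[^P]" if exclude_proline else "."
    # The lookahead keeps overlapping motifs such as N-N-S-S.
    pattern = re.compile("N(?=" + middle + "[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MNGTANPSNNSS"  # hypothetical sequence, not taken from the figures
    print(find_sequons(toy))
    print(find_sequons(toy, exclude_proline=True))
```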

The different peptides according to this
invention can also be used themselves for the production
of antibodies, preferably monoclonal antibodies specific
for the respective different peptides. For the
production of hybridomas secreting said monoclonal antibodies,
conventional production and screening methods can be
used. These monoclonal antibodies, which themselves are
part of the invention, provide very useful tools for the
identification and even determination of relative
proportions of the different polypeptides or proteins in
biological samples, particularly human samples
containing LAV or related viruses.

The invention further relates to the hosts 
(procaryotic or eucaryotic cells) which are transformed
by the above mentioned recombinants and which are capa- 
ble of expressing said DNA fragments. 

Finally the invention also concerns vectors
for transforming eucaryotic cells of human origin,
particularly lymphocytes, the polymerases of which are
capable of recognizing the LTRs of LAV. Particularly
said vectors are characterized by the presence of a LAV
LTR therein, said LTR being then active as a promoter
enabling the efficient transcription and translation in
a suitable host of a DNA insert coding for a determined
protein placed under its control.

Needless to say, the invention extends to all
variants of genomes and corresponding DNA fragments
(ORFs) having substantially equivalent properties, all
of said genomes belonging to retroviruses which can be
considered as equivalents of LAV. It must be understood
that the claims which follow are also intended to
cover all equivalents of the products (glycoproteins,
polypeptides, DNAs, etc.), whereby an equivalent is a
product, e.g., a polypeptide, which may differ from
a product defined in any of said claims, say through one
or several amino acids, while still having substantially
the same immunological or immunogenic properties. A
similar rule of equivalency shall apply to the DNAs, it
being understood that the rule of equivalency will then
be tied to the rule of equivalency pertaining to the
polypeptides which they encode.

It will also be understood that all the 
literature referred to hereinbefore and hereinafter and 
all patent applications and patents not specifically 
identified herein but which form counterparts of those 
specifically designated herein, must be considered as
incorporated herein by reference. 

It should further be mentioned that the
invention also relates to immunogenic compositions
that contain preferably one or more of the polypeptides
which are specifically identified above and which have
the amino acid sequences of LAV that have been
identified, or peptidic sequences corresponding to
previously defined LAV proteins. In this respect, the
invention relates more particularly to the
polypeptides which have the sequences corresponding more
specifically to the LAV sequences which have been
referred to earlier, i.e., the sequences extending
between the following first and last amino acids of the
LAV BRU proteins themselves, i.e., the polypeptides
having sequences contained in the LAV OMP or LAV
TMP or sequences extending over both, particularly those
extending between the following positions of the
amino acids included in the env open reading frame of
the LAV genome:

1-530 
34-530 
and more preferably 

531-877, particularly 660-700, 
37-130 
211-289
488-530 
490-620. 

These different sequences can be used for any
of the above defined purposes and in any of the
compositions which have been disclosed.
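
As a minimal sketch, the residue ranges listed above can be turned into fragments of an env translation as shown below. The ranges are read as 1-based and inclusive, which is an assumption of this sketch, and `fragment`, `ENV_RANGES` and the placeholder sequence are hypothetical names and data introduced only for illustration; the actual amino acid sequence is the one given in the figures.

```python
# Residue ranges quoted in the text, read as 1-based and inclusive
# positions within the env open reading frame (assumption of this sketch).
ENV_RANGES = [(1, 530), (34, 530), (531, 877), (660, 700),
              (37, 130), (211, 289), (488, 530), (490, 620)]


def fragment(env_protein: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein string."""
    if not 1 <= first <= last <= len(env_protein):
        raise ValueError(f"range {first}-{last} outside 1-{len(env_protein)}")
    return env_protein[first - 1:last]


# Placeholder sequence only; the real env translation comes from the figures.
placeholder_env = "A" * 877
lengths = {f"{a}-{b}": len(fragment(placeholder_env, a, b)) for a, b in ENV_RANGES}
print(lengths["37-130"])  # 94
```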

Finally the invention also relates to the 
different antibodies which can be formed specifically 
against the different peptides which have been disclosed 
herein, particularly to the monoclonal antibodies which 
recognize them specifically. The corresponding hybridomas,
which can be formed starting from spleen cells
previously immunized with such peptides which are fused
with appropriate myeloma cells and selected according to
standard procedures, also form part of the invention.

Phage λ clone E-H12 derived from LAV
infected cells has been deposited at the CNCM under No.
I-550 on May 9, 1986. Phage clone M-H11 derived from
LAV MAL infected cells has been deposited at the CNCM
under No. I-551 on May 9, 1986.

Claims:

1. A LAV MAL virus comprising RNA corresponding
to the cDNA of figs. 7A-7I.

2. The cDNA of figs. 7A-7I.

3. A recombinant comprising the cDNA of claim 2.

4. A probe containing a nucleic acid sequence

5. A method for identifying the presence in a
tissue of the LAV MAL virus, which comprises hybridizing RNA
obtained from said tissue with said probe of claim 4.

6. The method of claim 5, wherein said probe
can hybridize with RNA from said LAV virus to
identify said LAV MAL virus.

7. A peptide or fragment thereof whose amino
acid sequence is encoded by an open reading frame of a
cDNA sequence of the LAV MAL virus of claim 1.

8. The peptide of claim 7 encoded by a cDNA
sequence from amino-acyl residue 37 to amino-acyl
residue 130, or from amino-acyl residue 211 to
amino-acyl residue 289, or from amino-acyl residue 488
to amino-acyl residue 530 of figs. 3A-3F and 7A-7I.

9. The peptide of claim 7 encoded by a cDNA
sequence from amino-acyl residue 490 to amino-acyl
residue 620 or from amino-acyl residue 680 to amino-acyl
residue 700 of figs. 3A-3F and 7A-7I.

10. The peptide of claim 7 which comprises a
protein or glycoprotein whose amino acid sequence is
encoded by all or part of one of the following cDNA
sequences of figs. 3A-3F and 7A-7I:

OMP or gp110 proteins, including precursors: 1 to 530;
OMP or gp110 without precursor: 34-530; and
TMP or gp41 protein: 531-877.

11. The peptide of claim 10 encoded by all
or part of one of the following cDNA sequences of figs.
3A-3F and 7A-7I: 37-130, 211-289, 488-530, 490-620 or
680-700.

12. A method for the in vitro detection of the
presence of an antibody directed against a LAV virus in
a human body fluid, which comprises: contacting said
body fluid with an antigen obtained from said LAV MAL
virus of claim 1, said antigen consisting of a peptide
or a fragment thereof whose amino acid sequence is
encoded by an open reading frame of a cDNA sequence of
figs. 7A-7I; and then detecting the immunological
reaction between said antigen and said antibody.

13. The method of claim 12 wherein said
antigen detects said LAV MAL virus of claim 1.

14. The method of claim 12 which comprises the
steps of:

a) depositing a predetermined amount of said antigen 
into a cup of a titration microplate; 

b) introducing increasing dilutions of said body fluid 
into said cup; 

c) incubating said microplate; 

d) washing the microplate with a buffer; 

e) adding into said cup a labelled antibody directed
against blood immunoglobulins; and then

f) determining whether an antigen-antibody complex has
formed in said cup which is indicative of the presence
of a LAV antibody in said body fluid.

15. A diagnostic kit for the in vitro detection
of antibodies against a LAV virus, which kit
comprises: an antigen consisting of a peptide of claim

16. The kit of claim 15 wherein the antigen 
consists of a peptide of said LAV virus of claim 1, 
encoded by the open reading frame of a cDNA sequence of 
said LAV MAL virus. 

17. An immunogenic composition comprising: an
antigen of the LAV MAL virus of claim 1, or a
peptide or fragment thereof encoded by RNA of said
virus; and a physiologically acceptable carrier.

18. The immunogenic composition of claim 17
wherein said peptide is the gp110 envelope glycoprotein
or a fragment thereof.

19. The immunogenic composition of claim 17
wherein the peptide comprises a protein or glycoprotein
whose amino acid sequence is encoded by all or part of
one of the following cDNA sequences of figs. 3A-3F and
7A-7I:

OMP or gp110 proteins, including precursors: 1 to 530;
OMP or gp110 without precursor: 34-530; and
TMP or gp41: 531-877.

20. The composition of claim 19 wherein the 
protein or glycoprotein is encoded by all or part of one
of the following cDNA sequences of Figs. 3A-3F and 
7A-7I: 37-130, 211-289, 488-530, 490-620 or 680-700. 

21. An antibody formed against a peptide of claim 7.

22. A cell transformed with a DNA recombinant 
of claim 3. 
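
For illustration only, the titration of steps b) to f) of claim 14 can be sketched as below. The two-fold series starting at 1/10, the cutoff value and the signal readings are hypothetical and are not taken from this specification; `dilution_series` and `endpoint_titer` are names introduced for the sketch.

```python
from typing import Optional


def dilution_series(start: int = 10, factor: int = 2, cups: int = 8) -> list[int]:
    """Reciprocal dilutions for successive cups, e.g. 1/10, 1/20, 1/40, ..."""
    return [start * factor ** i for i in range(cups)]


def endpoint_titer(readings: dict[int, float], cutoff: float) -> Optional[int]:
    """Highest reciprocal dilution whose signal is still above the cutoff.

    Returns None when no cup is positive, i.e. no complex was detected.
    """
    positive = [dilution for dilution, signal in readings.items() if signal > cutoff]
    return max(positive) if positive else None


# Hypothetical signal per reciprocal dilution, for one body-fluid sample.
readings = dict(zip(dilution_series(cups=4), [1.8, 1.1, 0.45, 0.12]))
print(endpoint_titer(readings, cutoff=0.3))  # 40
```

A sample is scored positive for a LAV antibody when `endpoint_titer` returns a value rather than `None`; the choice of cutoff is left to the operator.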